NSF PAR Search | NSF Public Access Repository

Deep AutoAugment

Zheng, Yu; Zhang, Zhi; Yan, Shen; Zhang, Mi (January 2022, International Conference on Learning Representations (ICLR))

Full Text Available
MimicNet: fast performance estimates for data center networks with machine learning

https://doi.org/10.1145/3452296.3472926

Zhang, Qizhen; Ng, Kelvin K.; Kazer, Charles; Yan, Shen; Sedoc, João; Liu, Vincent (August 2021, Proceedings of the 2021 ACM SIGCOMM 2021 Conference)

Full Text Available
CATE: Computation-aware Neural Architecture Encoding with Transformers

Yan, Shen; Song, Kaiqiang; Liu, Fei; Zhang, Mi (January 2021, International Conference on Machine Learning)
Full Text Available
Does Unsupervised Architecture Representation Learning Help Neural Architecture Search?

Yan, Shen; Zheng, Yu; Ao, Wei; Zeng, Xiao; Zhang, Mi (January 2020, Conference on Neural Information Processing Systems)

Existing Neural Architecture Search (NAS) methods either encode neural architectures using discrete encodings that do not scale well, or adopt supervised learning-based methods to jointly learn architecture representations and optimize architecture search on such representations which incurs search bias. Despite the widespread use, architecture representations learned in NAS are still poorly understood. We observe that the structural properties of neural architectures are hard to preserve in the latent space if architecture representation learning and search are coupled, resulting in less effective search performance. In this work, we find empirically that pre-training architecture representations using only neural architectures without their accuracies as labels improves the downstream architecture search efficiency. To explain this finding, we visualize how unsupervised architecture representation learning better encourages neural architectures with similar connections and operators to cluster together. This helps map neural architectures with similar performance to the same regions in the latent space and makes the transition of architectures in the latent space relatively smooth, which considerably benefits diverse downstream search strategies.
Full Text Available
MutualNet: Adaptive ConvNet via Mutual Learning from Network Width and Resolution

Yang, Taojiannan; Zhu, Sijie; Chen, Chen; Yan, Shen; Zhang, Mi; Willis, Andrew (January 2020, European Conference on Computer Vision)

We propose the width-resolution mutual learning method (MutualNet) to train a network that is executable at dynamic resource constraints to achieve adaptive accuracy-efficiency trade-offs at runtime. Our method trains a cohort of sub-networks with different widths (i.e., number of channels in a layer) using different input resolutions to mutually learn multi-scale representations for each sub-network. It achieves consistently better ImageNet top-1 accuracy over the state-of-the-art adaptive network US-Net under different computation constraints, and outperforms the best compound scaled MobileNet in EfficientNet by 1.5%. The superiority of our method is also validated on COCO object detection and instance segmentation as well as transfer learning. Surprisingly, the training strategy of MutualNet can also boost the performance of a single network, which substantially outperforms the powerful AutoAugmentation in both efficiency (GPU search hours: 15000 vs. 0) and accuracy (ImageNet: 77.6% vs. 78.6%). Code is available at https://github.com/taoyang1122/MutualNet.
Full Text Available
Determining plasmonic hot-carrier energy distributions via single-molecule transport measurements

https://doi.org/10.1126/science.abb3457

Reddy, Harsha; Wang, Kun; Kudyshev, Zhaxylyk; Zhu, Linxiao; Yan, Shen; Vezzoli, Andrea; Higgins, Simon J.; Gavini, Vikram; Boltasseva, Alexandra; Reddy, Pramod; et al (July 2020, Science)
Hot-carriers in plasmonic nanostructures, generated via plasmon decay, play key roles in applications like photocatalysis and in photodetectors that circumvent band-gap limitations. However, direct experimental quantification of steady-state energy distributions of hot-carriers in nanostructures has so far been lacking. We present transport measurements from single-molecule junctions, created by trapping suitably chosen single molecules between an ultra-thin gold film supporting surface plasmon polaritons and a scanning probe tip, that can provide quantification of plasmonic hot-carrier distributions. Our results show that Landau damping is the dominant physical mechanism of hot-carrier generation in nanoscale systems with strong confinement. The technique developed in this work will enable quantification of plasmonic hot-carrier distributions in nanophotonic and plasmonic devices.
Full Text Available